Intelligent Automatic Configuration Parameter Tuning for Big Data Processing Platforms Using Machine Learning, Deep Learning, and Reinforcement Learning
Published: 12 December 2023 | Article Type: Research Article
Abstract
The rapid growth of big data processing calls for efficient, automated parameter tuning on distributed computing platforms such as Apache Hadoop and Apache Spark. Manual configuration tuning is time-consuming and inefficient, while existing auto-tuning methods impose an overhead of 20-30% of job execution time. This paper presents an online parameter tuning framework that combines Singular Value Decomposition (SVD)-based collaborative filtering, CNN-based deep learning feature extraction, stochastic gradient descent optimization, and reinforcement learning to automatically tune critical Hadoop/Spark configuration parameters. The framework has three components: (1) a configuration repository generator based on genetic algorithms and evolutionary computation, (2) a recommendation engine that implements SVD-based collaborative filtering augmented with deep learning, and (3) an online adaptive learning module that uses reinforcement learning to respond to changing cluster conditions. In experiments on a 4-node Hadoop 3.3.0 cluster, the approach improves performance by 24.2% over default configurations, with a mean percentage error (MPE) of 14.32% relative to theoretically optimal configurations. It reduces the time needed to produce a parameter recommendation by 88.3% (from 180 seconds to 21 seconds), improves average memory utilization by 13%, and scales across WordCount and Sort workloads with dataset sizes from 1 GB to 16 GB.
Keywords: Big Data, Parameter Tuning, Collaborative Filtering, Singular Value Decomposition, Machine Learning, Deep Learning, Reinforcement Learning, Hadoop, Apache Spark, Distributed Computing, Online Learning.
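As a rough illustration of the collaborative-filtering idea named in the abstract, the sketch below treats job runtimes as a sparse workload-by-configuration matrix and estimates the unobserved entries with a low-rank factorization trained by stochastic gradient descent. This is a Funk-SVD-style approximation rather than an exact SVD, and it is not the paper's implementation: the runtime values are made-up toy numbers, and the function names (`fit`, `predict`, `recommend`), the rank, the learning rate, and the regularization weight are all assumptions chosen for the example.

```python
import math
import random

# Toy, made-up data: runtime of each (workload, config) pair relative to the
# default configuration (config 0). Pairs that are absent were never executed.
OBSERVED = {
    (0, 0): 1.00, (0, 1): 0.85, (0, 2): 0.95, (0, 3): 1.10,
    (1, 0): 1.00, (1, 1): 0.80, (1, 3): 1.15, (1, 4): 0.90,
    (2, 0): 1.00, (2, 2): 0.78, (2, 3): 0.88, (2, 4): 1.05,
    (3, 0): 1.00, (3, 1): 1.08, (3, 2): 0.75, (3, 3): 0.85,
}
N_WORKLOADS, N_CONFIGS = 4, 5


def predict(mu, P, Q, w, c):
    """Estimated relative runtime of workload w under config c."""
    return mu + sum(p * q for p, q in zip(P[w], Q[c]))


def fit(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01, epochs=1000, seed=0):
    """Fit observed[(w, c)] ~ mu + P[w] . Q[c] with plain SGD (hypothetical settings)."""
    rng = random.Random(seed)
    mu = sum(observed.values()) / len(observed)  # global mean runtime
    P = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    samples = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(samples)
        for (w, c), y in samples:
            err = y - predict(mu, P, Q, w, c)
            for k in range(rank):
                pw, qc = P[w][k], Q[c][k]
                # Gradient step on squared error with L2 regularization.
                P[w][k] += lr * (err * qc - reg * pw)
                Q[c][k] += lr * (err * pw - reg * qc)
    return mu, P, Q


def rmse(observed, estimate):
    """Root-mean-square error of estimate(w, c) over the observed pairs."""
    total = sum((y - estimate(w, c)) ** 2 for (w, c), y in observed.items())
    return math.sqrt(total / len(observed))


def recommend(mu, P, Q, w, n_cols):
    """Index of the config with the lowest predicted runtime for workload w."""
    return min(range(n_cols), key=lambda c: predict(mu, P, Q, w, c))


mu, P, Q = fit(OBSERVED, N_WORKLOADS, N_CONFIGS)
baseline = rmse(OBSERVED, lambda w, c: mu)
fitted = rmse(OBSERVED, lambda w, c: predict(mu, P, Q, w, c))
print(f"mean-only RMSE {baseline:.3f}, factorized RMSE {fitted:.3f}")
print("suggested config for workload 1:", recommend(mu, P, Q, 1, N_CONFIGS))
```

Subtracting the global mean before factorizing keeps the latent factors small, which makes the SGD updates easier to stabilize on a matrix this sparse. A real tuner would also need to map each column index back to a concrete set of Hadoop or Spark parameter values, which this sketch leaves out.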
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright © Author(s) retain the copyright of this article.
How to Cite
Citation:
Naga Charan Nandigama. (2023-12-12). "Intelligent Automatic Configuration Parameter Tuning for Big Data Processing Platforms Using Machine Learning, Deep Learning, and Reinforcement Learning." *Volume 6*, 2, 9-19